240 PART 5 Looking for Relationships with Correlation and Regression

Looking at Table 17-2, let’s assume that your variable names are StudyID for Participant ID, Age for age, Weight for weight, and SBP for SBP. Imagine that you’re planning to run a regression model with this formula (using the shorthand notation described in the earlier section “Defining a few important terms”): SBP ~ Age + Weight. In this case, you should first prepare several scatter charts: one of SBP (outcome) versus Age (predictor), one of SBP versus Weight (another outcome versus predictor), and one of Age versus Weight (both predictors). For regression models involving many predictors, there can be a lot of scatter charts! Fortunately, many statistics programs can automatically prepare a set of small thumbnail scatter charts for all possible pairings among a set of variables, arranged in a matrix as shown in Figure 17-1.
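To see why the number of charts grows so quickly, the pairings can be enumerated programmatically. The following is a minimal Python sketch, assuming only the variable names SBP, Age, and Weight from the text; the helper name `scatter_pairs` is hypothetical and not part of any statistics package.

```python
from itertools import combinations


def scatter_pairs(variables):
    """Return every unordered pair of variables (one scatter chart per pair)."""
    return list(combinations(variables, 2))


# Variable names taken from the text: one outcome and two predictors.
variables = ["SBP", "Age", "Weight"]
pairs = scatter_pairs(variables)

for y, x in pairs:
    print(f"{y} versus {x}")

# With k variables there are k * (k - 1) / 2 distinct pairs.
for k in (3, 5, 10):
    print(k, "variables ->", k * (k - 1) // 2, "scatter charts")
```

If you work in Python, `pandas.plotting.scatter_matrix` is one function that draws a thumbnail matrix of this kind from a data frame; other statistics programs offer their own equivalents.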

These charts can give you insight into which variables are associated with each other, how strongly they’re associated, and their direction of association. They also show whether your data have outliers. The scatter charts in Figure 17-1 indicate that there are no extreme outliers in the data. Each scatter chart also shows some degree of positive correlation (as described in Chapter 15). In fact, from the charts in Figure 17-1, you may guess that they correspond to correlation coefficients between 0.5 and 0.8. In addition to the scatter charts, you can also have your software calculate correlation coefficients (r values) between each pair of variables. For this example, here are the results: r = 0.654 for Age versus Weight, r = 0.661 for Age versus SBP, and r = 0.646 for Weight versus SBP.
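As a rough illustration of what the software does when it calculates an r value for each pair of variables, here is a minimal Python sketch of the Pearson correlation coefficient. The numbers in `data` are made up for illustration and are NOT the values in Table 17-2, so the printed results will not match the r values quoted above; the function name `pearson_r` is also just a label chosen for this sketch.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# ILLUSTRATIVE values only: these are NOT the data in Table 17-2.
data = {
    "Age": [25, 32, 41, 47, 53, 60, 66, 72],
    "Weight": [58.0, 74.5, 63.0, 81.0, 70.5, 88.0, 77.0, 92.5],
    "SBP": [112, 121, 135, 128, 142, 138, 155, 149],
}

# One r value for each pair of variables, as in the scatter chart matrix.
r_values = {
    (a, b): pearson_r(data[a], data[b]) for a, b in combinations(data, 2)
}

for (a, b), r in r_values.items():
    print(f"r = {r:.3f} for {a} versus {b}")
```

With your own data, you would replace the lists in `data` with the actual columns; most statistics packages provide a built-in correlation function, so a hand-written version like this is mainly useful for seeing what is being computed.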

FIGURE 17-1: A scatter chart matrix for a set of variables prior to multiple regression.

© John Wiley & Sons, Inc.